
This collection contains close to 1M tweets from Russian government agencies, embassies, and other government institutions, harvested from 2017-02-13 to 2022-07-26. Tweet dates span from 2012 to 2022.

The meta folder contains harvesting and seed metadata.

* Seeds used in this collection are listed in the meta/seeds.json file.
* Harvest runs are documented in meta/harvests.json (please note that harvests may have been paused for certain time periods).

Data files are arranged by collection year. The data has been flattened to CSV with the following columns: created_at (tweet date), id (tweet ID), user_id (Twitter user ID), user_screen_name (screen name), and full_text (tweet text).

2017.csv  88M  
2018.csv  37M  
2019.csv  43M  
2020.csv  44M  
2021.csv   5M  
2022.csv  48M  
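
As a quick way to get a feel for one of the yearly files, the sketch below counts tweets per account using only the Python standard library. It assumes each CSV starts with a header row carrying the column names listed above and is UTF-8 encoded; neither is confirmed here, so adjust as needed. The function name count_tweets_by_account is only illustrative.

```python
import csv
from collections import Counter


def count_tweets_by_account(csv_path):
    """Count rows per user_screen_name in one yearly CSV file.

    Assumes a header row with a user_screen_name column and UTF-8 encoding.
    """
    counts = Counter()
    # newline="" lets the csv module handle line breaks inside quoted tweet text.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            counts[row["user_screen_name"]] += 1
    return counts
```

For example, count_tweets_by_account("2021.csv").most_common(10) would return the ten most active accounts in that file.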

If you need the full-resolution JSON data containing the complete contents of each tweet, use twarc to hydrate the tweet IDs in the CSV files. A short guide is here: https://twarc-project.readthedocs.io/en/latest/tutorial/
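
Hydration tools generally expect a plain text file with one tweet ID per line. The sketch below shows one way to produce such a file from a yearly CSV; it assumes a header row with an id column and UTF-8 encoding, and the function name write_tweet_ids is only illustrative. IDs are kept as strings so that large values are never rounded.

```python
import csv


def write_tweet_ids(csv_path, ids_path):
    """Write the id column of csv_path to ids_path, one ID per line.

    Returns the number of IDs written. Assumes a header row with an id column.
    """
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as source, \
            open(ids_path, "w", encoding="utf-8") as target:
        for row in csv.DictReader(source):
            tweet_id = row["id"].strip()
            if tweet_id:  # skip rows with an empty id field
                target.write(tweet_id + "\n")
                written += 1
    return written
```

After write_tweet_ids("2021.csv", "ids.txt"), the resulting file could be passed to a command such as twarc2 hydrate ids.txt tweets.jsonl, assuming twarc2 is installed and configured with API credentials.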

Note that the hydrated files will include fewer tweets than the CSV files, because hydration cannot return tweets that have been deleted or tweets from accounts that have been deleted, suspended, or protected. If you need access to the raw collected data containing everything, please contact me.

This data was collected using Social Feed Manager, developed by George Washington University Libraries. Citation: George Washington University Libraries. (2016). Social Feed Manager. Zenodo. https://doi.org/10.5281/zenodo.597278